BM432 Data Visualisation Workshop

Morgan Feeney

University of Strathclyde

Leighton Pritchard

University of Strathclyde

2024-10-22

1. Introduction

Learning Objectives

  • You should be able to critically analyse how data is visualised
  • You should be able to judge a figure’s clarity and potential for misunderstanding
  • You should be able to identify potential sources of bias resulting from the visualisation
  • You should understand how to create effective figures for your own work

2. Bad and Good Data Visualisations

3. Critique of Published Scientific Figures

Example 1

Figure 1: Small molecules identified in previous HTS increase GCase activity.

Your Critical Analysis of Example 1

Reasons for the grade
Dynamite plots and unclear comparisons.
Strengths: Combines schematic pathway diagrams, bar charts of lysosomal function, and microscopy images, which together convey the mechanistic link between PIKfyve modulation and lysosomal stress. Axes and statistical annotations are clean, and the use of consistent colour-coding across bar charts strengthens coherence.

Your Critical Analysis of Example 1

Suggestions for improvement
1D scatterplots. Line graph for B.
Weaknesses: The microscopy panels use pseudocolour overlays that are visually striking but may over-saturate the fluorescent signal, potentially exaggerating differences. Error bars are shown but sample sizes (n) are not always indicated on the plots.

Critique 1.1

Issue

Dynamite plot (a bar chart with error bars): the lower extent of the error bars is hidden by the bars, and the individual data points are not shown.

Solution

Use boxplots or 1D scatterplots, which show the spread of the data rather than only a mean and an error bar.
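As a rough illustration of what a dynamite plot can hide, the sketch below compares two samples with the same mean. The values and the function names `bar_summary` and `box_summary` are invented for this example and are not taken from the figure. The bar summary is what a dynamite plot would display. The box summary is what a boxplot would display.

```python
import statistics

# Hypothetical measurements (invented for illustration, not from the figure).
symmetric = [4.0, 5.0, 5.0, 6.0, 6.0, 7.0, 7.0, 8.0]
skewed = [5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 13.0]


def bar_summary(values):
    """What a dynamite plot shows: the mean and standard deviation only."""
    return statistics.mean(values), statistics.stdev(values)


def box_summary(values):
    """What a boxplot shows: minimum, quartiles, and maximum."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


for name, values in [("symmetric", symmetric), ("skewed", skewed)]:
    mean, sd = bar_summary(values)
    print(f"{name}: mean={mean:.2f} sd={sd:.2f} box={box_summary(values)}")
```

Both bars would have the same height, but the box summary shows that seven of the eight values in the skewed sample are identical and a single point drives the error bar. A 1D scatterplot would make this visible directly.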

Critique 1.2

Issue

Incomplete presentation of statistical comparisons, e.g. is there a difference between A16 and A18 in (B)?

Solution

Present a table of statistical differences instead, or alongside the figure.
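A table of comparisons can be generated systematically so that no pair is left implicit. The sketch below is one possible approach, not the method used in the paper. The group values are invented, and the permutation test is an assumption chosen because it needs only the standard library. Any appropriate test, with correction for multiple comparisons, could be substituted.

```python
import itertools
import random
import statistics

# Hypothetical activity values per treatment (invented; not from the paper).
groups = {
    "control": [1.00, 0.95, 1.05, 1.02],
    "A16": [1.30, 1.42, 1.35, 1.28],
    "A18": [1.33, 1.45, 1.31, 1.38],
}


def permutation_p(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = abs(statistics.mean(left) - statistics.mean(right))
        if diff >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def comparison_table(data):
    """One row per pair of groups: names, difference in means, p-value."""
    rows = []
    for (name_a, a), (name_b, b) in itertools.combinations(data.items(), 2):
        diff = statistics.mean(b) - statistics.mean(a)
        rows.append((name_a, name_b, round(diff, 3), round(permutation_p(a, b), 3)))
    return rows


rows = comparison_table(groups)
for row in rows:
    print("{:<9}{:<9}{:>8}{:>8}".format(*row))
```

Because every pair appears as a row, the A16 versus A18 comparison is reported explicitly whether or not it is significant.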

Critique 1.3

Issue

Distance between bars makes comparison awkward.

Solution

Place things to be compared by the reader next to each other where possible, to facilitate visual comparison.

Critique 1.4

Issue

The scale on the micrographs (especially in panel C) is too small to read easily.

Solution

Increase the size of the scale relative to the figure so it can be read.

Critique 1.5

Issue

A visual control (bright-field image) is absent.

Solution

In addition to showing DAPI and immunofluorescence images of the cells, include a bright-field micrograph of the cells (no fluorescence).

Critique 1.6

Issue

The colour scheme is misleading because saturation represents different data across figure panels (compare 1 \(\mu\)M A18 in B vs 5 \(\mu\)M in D, and 1 \(\mu\)M A18 in E).

Solution

Be consistent with the visual messaging of colour (hue, saturation, and luminance).

Critique 1.7

Issue

Too many comparisons in B clutter the figure and compress the space available for showing data.

Solution

Present a table of statistical differences instead, or alongside the figure, to reduce clutter.

Critique 1.8

Issue

\(y\)-axis scales vary between panels, so quantitative comparison between panels is difficult.

Solution

Use the same \(y\)-axis scale in all panels to facilitate direct comparison.
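A shared scale can be derived from the data rather than chosen by eye. The sketch below is a minimal illustration. The panel values and the helper name `shared_ylim` are invented, and the 5% padding is an arbitrary assumption.

```python
# Hypothetical y-values for three panels (invented for illustration).
panels = {
    "B": [0.8, 1.4, 2.1],
    "D": [0.5, 3.9, 2.2],
    "E": [1.1, 1.6, 0.9],
}


def shared_ylim(panel_values, pad=0.05):
    """Return one (low, high) range that covers the data in every panel."""
    all_values = [v for values in panel_values.values() for v in values]
    low, high = min(all_values), max(all_values)
    margin = (high - low) * pad
    return low - margin, high + margin


ylim = shared_ylim(panels)
print(f"Apply to every panel: y from {ylim[0]:.2f} to {ylim[1]:.2f}")
```

Plotting libraries often offer this directly. In matplotlib, for example, `plt.subplots(..., sharey=True)` gives all panels the same \(y\)-axis.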

Our Critical Analysis

Example 2

Figure 2: Endometriosis-associated macrophages exhibit significant transcriptomic heterogeneity.

Your Critical Analysis of Example 2

Reasons for the grade
UMAP is uninformative. Too complex. Horrible colour scheme.
Strengths: The figure integrates single-cell UMAP plots, cluster heatmaps, and gene-expression violin plots, giving a multidimensional view of macrophage heterogeneity. Each cell type is clearly colour-coded, and the legends are detailed. The use of gradient scales for expression intensity supports nuanced interpretation.

Your Critical Analysis of Example 2

Suggestions for improvement
Improve colour scheme in stacked bar chart. Don’t use UMAP.
Weaknesses: The colour palette, though consistent, risks perceptual overlap between similar clusters—problematic for readers with colour-vision deficiencies. Violin plots are compact but occasionally under-labelled, making it hard to match specific genes to clusters. Some panels are densely packed, prioritising comprehensiveness over readability.

Critique 2.1

Issue

UMAP plots (B, E) are highly manipulable: the layout depends on parameter choices, and the apparent clustering and placement of points do not necessarily reflect objective measures.

Solution

Be cautious of over-interpretation of UMAP and other nonlinear dimensionality reduction plots.

Critique 2.2

Issue

Unpleasant clashing (R default) colour choices in (C).

Solution

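Use a deliberately chosen colour palette, for example one designed to remain distinguishable for readers with colour-vision deficiencies.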
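One reasonable choice is the Okabe-Ito palette, which is widely recommended as colour-blind friendly. The sketch below lists its hex codes and ranks them by relative luminance, using the WCAG definition, as a rough check that the colours also differ in lightness. This is offered as an example only. The workshop does not prescribe a palette, and the helper names are invented.

```python
# Okabe-Ito palette, a commonly recommended colour-blind-friendly set.
OKABE_ITO = {
    "orange": "#E69F00",
    "sky blue": "#56B4E9",
    "bluish green": "#009E73",
    "yellow": "#F0E442",
    "blue": "#0072B2",
    "vermillion": "#D55E00",
    "reddish purple": "#CC79A7",
    "black": "#000000",
}


def hex_to_rgb(code):
    """Convert '#RRGGBB' to a tuple of integers in the range 0-255."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def relative_luminance(code):
    """WCAG relative luminance: 0.0 for black, 1.0 for white."""
    def linear(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linear(ch) for ch in hex_to_rgb(code))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


ranked = sorted(OKABE_ITO.items(), key=lambda kv: relative_luminance(kv[1]))
for name, code in ranked:
    print(f"{name:<15}{code}  luminance={relative_luminance(code):.2f}")
```

Colours that differ in luminance as well as hue tend to stay distinguishable when printed in greyscale.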

Critique 2.3

Issue

The proportion plot in (C) does not give information on absolute number, only proportion/composition.

Solution

A proportional areas plot spanning all clusters would represent both absolute count per group and compositional information.
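The difference between composition and absolute count can be seen with a small calculation. In the sketch below the cell counts are invented for illustration. The two samples have identical proportions, so they would look identical in a proportion plot even though one contains ten times as many cells.

```python
# Hypothetical cell counts per cluster in two samples (invented values).
counts = {
    "sample_1": {"cluster_a": 50, "cluster_b": 30, "cluster_c": 20},
    "sample_2": {"cluster_a": 500, "cluster_b": 300, "cluster_c": 200},
}


def proportions(cluster_counts):
    """Fraction of the sample's cells that fall in each cluster."""
    total = sum(cluster_counts.values())
    return {name: n / total for name, n in cluster_counts.items()}


totals = {sample: sum(c.values()) for sample, c in counts.items()}
props = {sample: proportions(c) for sample, c in counts.items()}

for sample in counts:
    print(sample, "total cells:", totals[sample], "proportions:", props[sample])
```

A plot that encodes the total as well as the composition, such as the proportional areas plot suggested above, would distinguish these two samples.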

Critique 2.4

Issue

Heatmap text is too small to read comfortably.

Solution

Present heatmap as a separate figure, or reduce the amount of information in the image.

Critique 2.5

Issue

Heatmap is missing a colour scale (is purple high and yellow low, or vice versa?).

Solution

Add a colour scale bar that maps colour to value.

Critique 2.6

Issue

The experimental summary in (A) does not indicate order of operations.

Solution

Use arrows to indicate order of steps and/or dataflow.

Critique 2.7

Issue

Text is too small in general to read comfortably.

Solution

Increase font size and/or break up the panels into individual figures.

Critique 2.8

Issue

The figure is crowded and the separation of panels is unclear.

Solution

Use whitespace to guide reader “flow” through the figure and reduce crowding. In particular, cramming (C) under the inset from (B) makes the figure feel very crowded. Alternatively, break the panels into multiple figures.

Critique 2.9

Issue

The overall message of the figure is unclear.

Solution

If the intent is just that the macrophages exhibit transcriptional heterogeneity, then (D) is probably sufficient. If other messages are intended, then revise for clarity.

Example 3

Figure 3: A C. difficile mutant lacking all three YkuD-type Ldts (\(\Delta\)ldt1-3) exhibits wild-type growth, morphology, and 3-3 cross-linking.

Your Critical Analysis of Example 3

Reasons for the grade
What’s the point of A? Dynamite plots.
Strengths: The authors use simple bar and line graphs to depict growth curves, cell length quantification, and cross-link percentages. The visual minimalism effectively conveys the “no significant difference” message. Microscopy images are clean, and the inclusion of scale bars and replicates adds credibility.

Your Critical Analysis of Example 3

Suggestions for improvement
Use 1D scatters for D and E
Weaknesses: Because the main conclusion is the absence of a difference, the data could be better supported by visualising effect sizes or confidence intervals instead of relying on “ns” labels. The axis labels could include raw units (e.g., OD600, µm) on all subplots for accessibility.

Critique 3.1

Issue

Dynamite plot (a bar chart with error bars): the lower extent of the error bars is hidden by the bars, and the individual data points are not shown.

Solution

Use boxplots or 1D scatterplots, which show the spread of the data rather than only a mean and an error bar.

Critique 3.2

Issue

No complemented triple mutant strain is shown: is data for an essential control missing?

Solution

Ensure that all controls are presented so that the experimental result can be interpreted properly.

Critique 3.3

Issue

Label obscures part of the image.

Solution

Relocate the label so data is not obscured.

Critique 3.4

Issue

Fluorescence wavelength not specified.

Solution

Include sufficient information that the reader does not need to refer to the main text. Overall a figure legend should provide enough detail for the figure to stand alone, but should not describe results/significance.

Critique 3.5

Issue

Colour schemes/palettes are inconsistent within panel A, and between panels B, D, and E.

Solution

Be consistent with the visual messaging of colour (hue, saturation, and luminance).

Critique 3.6

Issue

Excessive length of figure legend for panel A.

Solution

Break out panel A into its own figure.

Example 4

Figure 4: Functional characterization and overall structure of Rv1217c–1218c.

Your Critical Analysis of Example 4

Reasons for the grade
Very clear. Great structures - but dynamite plots!
Strengths: The figure integrates quantitative bar plots, Western blots, and cryo-EM structural renderings into a cohesive narrative. The ATPase assay bar chart uses distinct, interpretable colours, and the structural renderings make excellent use of domain-based colouring and orientation cues.

Your Critical Analysis of Example 4

Suggestions for improvement
1D scatterplots for B
Weaknesses: The cryo-EM density map could benefit from a scale bar and annotation of local resolution variation, which are standard in structural biology visualisation. The bar charts might be improved by showing individual data points rather than only mean ± SD, to convey variability.

Critique 4.1

Issue

The rifampicin structure is purely decorative.

Solution

Remove unnecessary elements and avoid needless distractions in figures.

Critique 4.2

Issue

Dynamite plot (a bar chart with error bars): the lower extent of the error bars is hidden by the bars, and the individual data points are not shown.

Solution

Use boxplots or 1D scatterplots, which show the spread of the data rather than only a mean and an error bar.

Critique 4.3

Issue

Colour scheme in (B) doesn’t seem purposeful and doesn’t add anything to the figure.

Solution

Use colour consistently to link or distinguish elements/groups/data, rather than for decoration.

Critique 4.4

Issue

The meaning of the grey regions in (C) and (D) is unclear.

Solution

The implied membrane in (C) and (D) could be labelled as such in the figure. By convention in microbiology, we would assume that the periplasm/extracellular space is “up” and the cytoplasm is “down” in (C) and (D) - but this should really be labelled to avoid any potential confusion.

Critique 4.5

Issue

Colours in (A) are difficult to distinguish, especially with the red boxes, which seem to skew blue closer to purple.

Solution

Instead of well images, a heatmap with clearer colour distinction could be presented.

Critique 4.6

Issue

Showing two cut-out bands is not an appropriate way to show Western blot data.

Solution

Present the complete blot.

Critique 4.7

Issue

Text in (D) is too small to read easily.

Solution

Break out panel into a separate figure or increase font size.

4. Summing Up

General Comments

  • Data presentation choices
  • Colour choices
  • Larger figures/graphs, more space between figures/graphs
  • Too much data per figure
    • Split into multiple figures
    • Remove unnecessary data (how do we define this?)
  • “The data is presented in a manner that would likely be inaccessible for people without prior experience. A move toward a more palatable/digestible format will facilitate better science communication in the future.”

Further Reading